
164000-13-P


Eighth Type II Quarterly Status
and Technical Progress Report

STUDY OF SPECTRAL/RADIOMETRIC
CHARACTERISTICS OF THE THEMATIC
MAPPER FOR LAND USE APPLICATIONS


21 June 1984 - 20 September 1984

WILLIAM A. MALILA
MICHAEL D. METZLER


OCTOBER 1984


Contract NAS5-27346
NASA Goddard Space Flight Center
Greenbelt Road
Greenbelt, Maryland 20771


(E85-10057 NASA-CR-174229) STUDY OF
SPECTRAL/RADIOMETRIC CHARACTERISTICS OF THE
THEMATIC MAPPER FOR LAND USE APPLICATIONS
(Quarterly Status and Technical Progress Report,
21 Jun. - 20 Sep. 1984) (Environmental Research Inst.
of Michigan) G3/43 N85-16268 Unclas 00057


ENVIRONMENTAL RESEARCH INSTITUTE OF MICHIGAN

BOX 8618 • ANN ARBOR • MICHIGAN 48107



TECHNICAL REPORT STANDARD TITLE PAGE 


1. Report No.

164000-13-P


2. Government Accession No.


3. Recipient's Catalog No.


4. Title and Subtitle

Study of Spectral/Radiometric Characteristics
of the Thematic Mapper for Land Use Applications


5. Report Date

October 1984


6. Performing Organization Code


7. Author(s)

William A. Malila and Michael D. Metzler


8. Performing Organization Report No.

164000-


9. Performing Organization Name and Address

Environmental Research Institute of Michigan
P.O. Box 8618
Ann Arbor, MI 48107


10. Work Unit No.


11. Contract or Grant No.

NAS5-27346


12. Sponsoring Agency Name and Address

National Aeronautics and Space Administration
Goddard Space Flight Center
Greenbelt, MD 20771


13. Type of Report and Period Covered

Quarterly Status and
Technical Progress
21 June - 20 Sep 1984


14. Sponsoring Agency Code


15. Supplementary Notes

Mr. Harold Oseroff (Code 902) is serving as Technical Officer and
Messrs. Brian Markham and James Irons (Code 923) are serving as Science
Representatives for NASA/GSFC.

16. Abstract

Progress during ERIM's eighth quarter of effort under the Landsat-4 
and 5 Image Data Quality Assessment program for the Thematic Mapper is 
described. 

Analyses of Landsat-5 TM radiometric characteristics were performed.
Effects which had earlier been found in Landsat-4 TM data were found to
be present in Landsat-5 data as well, including:

1) Scan-direction-related signal droop

2) Scan-correlated level shifts 

3) Low-frequency coherent noise 

Coincident Landsat-4 and 5 raw TM data were analyzed, and band-by- 
band relationships between the two sensors were derived. 

Earlier efforts which developed an information-theoretic measure of 
multispectral information content were continued, comparing TM and MSS 
information content. 


17. Key Words

Radiometric Calibration 
Landsat 4, Landsat 5 
Thematic Mapper 
Noise 

Information Theory 


18. Distribution Statement 

Initial distribution is listed 
at the end of this document. 


19. Security Classif. (of this report)

UNCLASSIFIED 


20. Security Classif. (of this page) 

UNCLASSIFIED 


21. No. of Pages

33 + iii 


22. Price 



INFRARED AND OPTICS DIVISION 


Report No. 164000-13-P


Eighth 

Type II Quarterly Status 
and Technical Progress Report 
21 June 1984 - 20 September 1984 

for 

Study of Spectral/Radiometric Characteristics 
of the Thematic Mapper for Land Use Applications 


under 

Contract NAS5-27346 
with 

NASA Goddard Space Flight Center 
Greenbelt Road 
Greenbelt, Maryland 20771 


Submitted by 

Environmental Research Institute of Michigan 
P.O. Box 8618
Ann Arbor, Michigan 48107 


Prepared by: 

William A. Malila
Principal Investigator 


Approved by: 



Robert Horvath 

Manager, Information Processing 
Department 


October 1984 




Table of Contents 


1. OBJECTIVE

2. TASKS

3. STATUS AND TECHNICAL PROGRESS

3.1 PROBLEMS

3.2 ACCOMPLISHMENTS

3.2.1 Landsat-5 TM Noise and Droop Effects

3.2.2 TM Landsat-4 vs Landsat-5 Radiometric Comparison

3.2.3 Information-Theoretic Comparison of TM and MSS Data

3.3 SIGNIFICANT RESULTS

3.4 PUBLICATIONS AND PRESENTATIONS

3.5 RECOMMENDATIONS

3.6 FUNDS EXPENDED

3.7 DATA RECEIPTS


APPENDIX A

DISTRIBUTION LIST





Eighth Quarterly Report 

STUDY OF SPECTRAL/RADIOMETRIC CHARACTERISTICS 
OF THE THEMATIC MAPPER FOR LAND USE APPLICATIONS 


1. OBJECTIVE 


The objective of this investigation is to quantify the performance of the TM as 
manifested by the quality of its image data in order to suggest improvements in 
data production and to assess the effects of the data quality on its utility for land 
resources applications. Three categories of this analysis are: a) radiometric effects, 
b) spatial effects, and c) geometric effects, with emphasis on radiometric effects.


2. TASKS 


Four tasks have been established to address the above objective. The first 
three are to study radiometric performance, spatial performance, and geometric 
performance, respectively, while the fourth is to study spectral characteristics. In 
keeping with the identified objective, the radiometric performance study is our major 
task. 


3. STATUS AND TECHNICAL PROGRESS 


During this eighth quarterly reporting period, a more in-depth analysis of 
Landsat-5 TM radiometric characteristics was performed. Scan-direction-related 
‘droop’ effects and scan-related level shifts were found and examined using both
nighttime data and a relatively uniform scene of daytime data. Coincident 
Landsat-4 and Landsat-5 TM data were compared, and band-by-band correlations 
were established for the values prior to radiometric correction. Earlier efforts which 
developed an information-theoretic measure of information content in multispectral 
data were continued and extended. 


3.1 PROBLEMS

Non-receipt of radiometrically and geometrically corrected data from the 
coincident-coverage scene of Landsats-4 and 5 has precluded a complete comparison 
of TM data from the two satellites. 

Definitive quantitative analyses of noise effects were hampered by the fact that 
the night scene (5-0052-02182) data we received were incomplete. In Quadrant 3, 
Lines 36-52 were missing from Band 1, as was Line 102 from Band 1, Quadrant 
4. 


3.2 ACCOMPLISHMENTS 

Accomplishments in three technical areas are described below. 


3.2.1 Landsat-5 TM Noise and Droop Effects 

Earlier efforts by the authors[1,2,4,5,9] resulted in characterization of noise
effects in Landsat-4 TM which had not previously been reported. Three types of
noise were reported:

1. An effect related to scan-direction was discovered, whereby the mean
signal level (in reflective bands) was observed to decay (‘droop’) as the
active scan progressed during daytime scenes, and similarly to rise as a
function of time in nighttime scenes[2,4].

2. A shift of the mean signal of several adjacent detectors in Band 1
upward or downward for one or more scans was first reported by
Kieffer[3]. We found that the effect was not limited to Band 1, and
provided the initial characterization of this effect[4,5]. This included
discovery that all level shifts were strictly correlated among the affected
detectors, with two basic level shift patterns being present in all
detectors (Bands 1-5,7) to varying degrees. Odd and even detectors
were generally 180° out of phase with one another, one set shifting up
when the other set shifted down. The two patterns were characterized
by Band 1 Detector 4 and Band 7 Detector 7, respectively.

This effect has been examined by several other investigators[e.g.,
3,7-9]. Correction mechanisms have been proposed[4-9], and in the case
of the Canada Centre for Remote Sensing (CCRS), implemented in an
operational radiometric correction processing system[7].

3. A low frequency (approximately 400Hz or 262-264 pixel wavelength) 
coherent noise was detected in Bands 1-5 and 7[9]. This low amplitude 
(<0.75DN) effect was observed in nighttime data only. 

The tools developed for Landsat-4 TM noise analysis were applied to Landsat-5 
TM data. Those analyses revealed artifacts in Landsat-5 TM data which 
correspond to the three types of noise described above. Only raw, uncorrected data 
were available for analysis, but the effects are expected to be found in corrected
data as well, if they follow the pattern we found in Landsat-4 TM data.

Descriptions of those findings are provided below; in-depth analysis is planned for
the next reporting period.

Within-Line Droop. Both nighttime reflective band data and data from
relatively homogeneous daytime scenes were used to analyze Landsat-5 TM data for 
the presence of the within-line droop effect discovered in Landsat-4 TM. Scene 
5-0052-02182 (Harrisburg, PA) provided the nighttime data, while the daytime data 
were from Scene 5-0014-15460 (Alabama). The daytime scene, although not ideal 
in terms of spatial and spectral homogeneity, had the advantage of being coincident 
with Landsat-4 TM Scene 4-0608-15463, thus allowing direct comparison of effects 
in the two sensors. 

The data examined revealed a ‘droop/rise’ effect in Landsat-5 TM data which 
appeared nearly identical to that observed with Landsat-4. The magnitude of the 
Landsat-5 effect appears to be somewhat less than for Landsat-4 TM, but the 
direction (nighttime ‘rise’ and daytime ‘droop’), and time constants (approximately 
800-1000 pixels) appear very similar. Figure 1 illustrates this effect for Band 4 of 
both Landsat-4 and Landsat-5 Thematic Mappers. Apparently the causal 
mechanism was not removed by the modifications made to Landsat-5 TM prior to 
launch. 

Scan-Correlated Level Shifts. Scene 5-0052-02182 provided the radiometrically
uniform scene data essential for optimal extraction of scan-correlated level shifts. 
Unfortunately, the missing data problems mentioned above with regard to this 
particular scene hampered the analysis of the level shift effect also. 

Figure 2 illustrates the scan-line mean signal returned by each of the detectors 
in Band 3. The level shifts observed in the Landsat-5 TM data appear to fit a 
single pattern as opposed to the two patterns identified for Landsat-4 TM. The 
Landsat-5 level shifts may be characterized by the detectors in Band 3, and for this 
reason are identified as Type 5-3 level shifts by Barker[8]. Although all Band 3 
detectors exhibit this level shift behavior, it may be found in nearly all the detectors 
in all the reflective bands to some extent. As with the level shifts found in 
Landsat-4 TM data, some of the detectors shift with a phase directly opposed to the 
phase of the prototype (Band 3) detectors. These phase relationships are seen most 
clearly in Bands 1, 5, and 7, where, in general, the odd numbered detectors shift in 
phase with the Band 3 detectors, and the even-numbered detectors have a level 
shift pattern 180° out of phase with Band 3 shifts. 

Low-Frequency Coherent Noise. Low-frequency (approximately 400Hz) coherent 
noise was observed in Landsat-4 TM nighttime reflective data. This noise was seen 
to be of low amplitude and present in all non-thermal bands. Preliminary analysis 
of the one nighttime Landsat-5 TM scene we have indicates that this noise is 
present in Landsat-5 TM data as well. Figure 1 illustrates this noise (along with 
the nighttime ‘rise’) for Band 4 of both sensors. The approximately 260-pixel 
period is clearly present in these data, even though the Landsat-5 data had only 
minimal filtering applied. The low amplitude (<0.02DN) of this noise would prevent 
it from being observed in daytime data if the effect were additive. Previous 
examination of daytime Landsat-4 TM data did not reveal any discernible effects of
this coherent noise.
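
One way to look for a coherent component of this kind is to project a mean-removed signal profile onto sinusoids of trial periods and pick the period with the most power. The sketch below is illustrative only and is not the analysis tool used in this study; the function name `dominant_period`, the synthetic profile, its 0.2 DN amplitude, and the trial-period range are all assumptions.

```python
import math

def dominant_period(profile, trial_periods):
    """Return the trial period (pixels) whose sinusoid carries the most power."""
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]
    best_period, best_power = None, -1.0
    for period in trial_periods:
        w = 2.0 * math.pi / period
        # Project the mean-removed profile onto cosine and sine at this period.
        c = sum(v * math.cos(w * k) for k, v in enumerate(centered))
        s = sum(v * math.sin(w * k) for k, v in enumerate(centered))
        power = (c * c + s * s) / n
        if power > best_power:
            best_period, best_power = period, power
    return best_period

# Hypothetical dark-scene profile: 2.0 DN offset plus a 0.2 DN sinusoid with a
# 263-pixel period, 20 full cycles long (all values are illustrative only).
profile = [2.0 + 0.2 * math.sin(2.0 * math.pi * k / 263.0) for k in range(263 * 20)]
peak = dominant_period(profile, range(240, 291))
print("dominant period (pixels):", peak)
```

With real data, a slow trend such as the nighttime ‘rise’ would presumably need to be removed first, since it places power at long periods.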


3.2.2 TM Landsat-4 vs Landsat-5 Radiometric Comparison 

As Landsat-5 was moving to its WRS orbit after launch, an opportunity was
present to acquire near-simultaneous Landsat-4 and 5 TM data over Alabama.
Analysis of the raw data can provide an indication of the radiometric consistency 
and linearity of the two sensors; analysis of the radiometrically corrected data 
provides the information necessary to use the two sensors together or 
interchangeably. Our current analyses were restricted to the raw, radiometrically 
uncorrected data only, because the corrected data tapes have not been received. 

Approximately 30 regions were selected from Scene 4-0608-15463 which 
spanned the scene dynamic range in each of the seven spectral bands. These 
regions were then extracted from the Landsat-5 data (Scene 5-0014-15460), and 
region mean signal values were calculated for each band of each sensor. 
Band-by-band plots of these data revealed a high degree of linearity and correlation, 
also indicated by R² values of >0.995 in all cases. The relationships between
Landsat-4 and Landsat-5 TM data over the dynamic range present in these scenes 
are illustrated by Figures 3(a)-(g) for Bands 1-7, respectively. In general, the gains 
of Landsat-5 TM Primary Focal Plane bands (1-4) are slightly greater than for 
Landsat-4, and the Landsat-5 Cold Focal Plane bands (5-7) have lower gains. Band 
6 of Landsat-5 has a much lower gain than Landsat-4 in these uncorrected data, 
possibly due to different shutter reference temperatures in the two sensors. 

The actual regression coefficients have limited value since these relationships 
were developed for radiometrically uncorrected data. The radiometric correction 
process, in providing the conversion of signal counts (DN) to radiance, would 
presumably remove the differences between the two sensors, i.e., the regressions 
would each have unity gain and zero offset. Examination of radiometrically 
corrected data will demonstrate the success of the process in achieving that goal. 
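
The band-by-band comparison amounts to an ordinary least-squares fit of Landsat-5 region means against Landsat-4 region means, with R² as the measure of linearity. A minimal sketch follows; the function name `linear_fit` and the region-mean values are hypothetical and are not the regression results of this study.

```python
def linear_fit(x, y):
    """Least-squares gain, offset, and R^2 for y = gain * x + offset."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return gain, offset, r_squared

# Hypothetical region means (DN) for one band; Landsat-5 values are generated
# from an assumed gain of 1.05 and offset of 2.0 purely for illustration.
landsat4_means = [20.0, 35.0, 50.0, 80.0, 120.0]
landsat5_means = [1.05 * v + 2.0 for v in landsat4_means]

gain, offset, r_squared = linear_fit(landsat4_means, landsat5_means)
print(gain, offset, r_squared)
```

For radiometrically corrected data, the expectation stated above corresponds to a gain near unity and an offset near zero.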


3.2.3 Information-Theoretic Comparison of TM and MSS Data 

In the sixth quarterly report on this contract[11], an information-theoretic
measure of the information content of multispectral scanner data was derived and 
applied to Landsat TM and MSS data. The effort was continued during the current 
quarter, with some additional analysis results. The additional analyses are 
summarized in this section. Appendix A contains a paper, describing the overall 
effort, which was prepared for presentation at the Eighteenth International 
Symposium on Remote Sensing of Environment and for publication in the 
proceedings of that symposium. Simple numerical examples to help one understand 
entropy concepts were generated and are in the Appendix, but will not be repeated 
here. 


Data-Space Descriptions. In the previous report, graphs were presented of the
quantities that compared information capacities and data-set characteristics. The
diagram in Figure 4 helps describe the various terms used to designate spectral
data-space characteristics, while Table 1 quantifies observed values for several cases 
that were considered. First, the system-design capacities of the Landsat-4 TM and 
MSS are presented, in terms of the number of bits transmitted to the ground and/or 
recorded on computer-compatible tapes (CCTs). The system design capacity is the 
sum of the available bits in the various spectral bands (equal to the logarithm to 
the base two of the product of the maximum number of digital levels in those 
bands, hence a “volume”.) For TM, the number of bits recorded on CCTs is the 
same as that transmitted (8 bits/channel). For MSS, however, the six-bit 
telemetered data are expanded to seven bits on the CCTs, with only an apparent 
gain of information. Comparisons involving MSS are given for both seven-bit data 
(since that is the form in which we received them) and data after a degradation to 
six bits was performed. The greater information potential of the TM system design 
(reflective bands), as compared to the MSS system, is quantified as 48 vs 24 bits in 
telemetered data. 
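
The capacity figures quoted above follow directly from the definition. The short sketch below reproduces them; the helper name `design_capacity_bits` is an assumption introduced for illustration.

```python
import math

def design_capacity_bits(levels_per_band):
    """Sum of log2(levels) over bands, i.e. log2 of the product of the levels."""
    return sum(math.log2(levels) for levels in levels_per_band)

tm_bits = design_capacity_bits([256] * 6)         # six reflective bands, 8 bits each
mss_sensor_bits = design_capacity_bits([64] * 4)  # four bands, 6 bits telemetered
mss_cct_bits = design_capacity_bits([128] * 4)    # four bands, 7 bits on CCT

print(tm_bits, mss_sensor_bits, mss_cct_bits)
```

The sum over bands and the logarithm of the product of levels are the same quantity, which is why the capacity can be read as a data-space "volume" expressed in bits.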

Figure 4 also portrays the "hypercube" volume or data-space volume spanned 
by multispectral data. These volumes are computed by summing the bit equivalents 
of the observed data-value ranges (max - min + 1) in each band being considered.
Upon comparing the fractions of their total data-space volumes that are spanned by 
data from the agricultural scene, one observes that the TM data fall nine bits short 
of capacity, while the MSS data fall approximately six bits short of capacity. 
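
A "hypercube" volume can be computed the same way as the design capacity, using the number of levels actually spanned in each band. In the sketch below the per-band minima and maxima are hypothetical, chosen so the arithmetic is easy to follow; they are not the observed ranges of the agricultural scene.

```python
import math

def hypercube_bits(band_ranges):
    """Sum over bands of log2(max - min + 1), the bit equivalent of each range."""
    return sum(math.log2(hi - lo + 1) for lo, hi in band_ranges)

# Hypothetical (min, max) data values for a four-band, 7-bit data set.
ranges = [(10, 41), (5, 68), (0, 127), (3, 18)]  # 32, 64, 128, and 16 levels
volume_bits = hypercube_bits(ranges)
shortfall_bits = 4 * 7 - volume_bits             # bits short of the 28-bit capacity

print(volume_bits, shortfall_bits)
```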

Actual data dispersion volumes (see Figure 4 and Table 1) were found to be 
substantially smaller than the “hypercube” volumes. Results for both real and
synthetic data shown in Table 1 represent the actual information content of the 
data. (Note that these values for actual information are substantially smaller than 
reported elsewhere for similar comparisons in which the “hypercube” volumes are 
treated as the information content[12].) The number of observations analyzed 
established a maximum limit on each entropy value. The concentration of multiple 
observations (pixels) into individual spectral cells reduces the information content
below the potential maximum. The data sets described in Table 1 show very little 
tendency for TM pixels to cluster, due to the very large system capacity, spectral 
diversity, and fine gradation of the TM bands. The MSS data show definite 
tendencies for multiple observations in spectral cells. 

Table 1 also shows that the TM data represent 3.3 bits more information than 
the MSS sensor data, with approximately two bits being associated with spatial 
resolution (pixel size and number) and the remainder with spectral bands and 
radiometric resolution. Since the synthetic data have the same number of 
observations for both TM and MSS, they can be considered to have equal spatial 
resolutions. Thus, the 2.2-bit difference must be due to their spectral and 
radiometric properties. 

Noise. Noise in multispectral data was not considered explicitly in the results 
presented herein. Sensor noise effects certainly were present in the real Landsat 
data and natural variations of crop observations were present in both real and 
synthetic data. Noise can add variance to signals and increase the number of 
spectral cells occupied (above that for no noise), thereby creating an apparent 
information content greater than the true information content of ideal, noiseless 
signals. One might address such effects by applying an appropriate quantization
factor (greater than unity) to each band to reduce the number of discrete levels 
present in data sets, and computing the reduced information content. 
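
The effect of such a quantization factor can be sketched on a toy data set. The code below assumes an integer factor applied by integer division to every band, which is only one possible reading of the approach; the names `requantize` and `entropy_bits` and the pixel values are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(cells):
    """Entropy of a list of spectral cells, computed from cell populations."""
    counts = Counter(cells)
    n_obs = len(cells)
    return math.log2(n_obs) - sum(c * math.log2(c) for c in counts.values()) / n_obs

def requantize(pixels, factor):
    """Coarsen every band by an integer quantization factor (assumed form)."""
    return [tuple(value // factor for value in pixel) for pixel in pixels]

# Hypothetical two-band pixels: three clusters blurred by one-count noise.
pixels = [(10, 20), (11, 21), (10, 21), (30, 40),
          (31, 41), (50, 60), (51, 61), (50, 61)]

h_full = entropy_bits(pixels)      # every pixel occupies its own cell
coarse = requantize(pixels, 2)     # noise-separated pixels merge into 3 cells
h_coarse = entropy_bits(coarse)

print(h_full, len(set(coarse)), h_coarse)
```

In this toy case the one-count differences that spread the pixels over eight cells disappear after requantization, so the reduced entropy is closer to that of the three underlying clusters.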

Summary Discussion. An information-theoretic measure has been defined and 
applied to Landsat multispectral data, both real and synthesized. Examples of the 
basic concepts also were generated. The measure does quantify signal dispersion 
patterns, independently of class membership and distributional assumptions. It also 
provides an alternate method of measuring the extent to which subsets of bands or
transformed variables represent the total pattern. In planning analyses and
interpreting results, however, analysts should ensure that data sets being analyzed
are representative of the problems under consideration. 

A number of observations were made from this initial study. The 
system-design information capacity of TM is much greater than that of MSS. The
potential information capacities and the signal "hypercube" volumes of agricultural 
data were much larger than the information actually represented by signal 
dispersion patterns in the sets of data values analyzed. For an agricultural data 
set, the gain in information content of TM over MSS was 3.3 bits, far less than 
the difference in design capacities. Tasseled Cap transformations preserved the 
information in original bands and offered a modest savings in bits over those 
original bands, a fact which might be useful in data compression approaches. There 
were extremely few multiple occurrences of spectral observations in the TM data 
set, but a reasonably high number in the MSS data, another indication of TM's 
finer partitioning of spectral space. For the “best” combinations of variables, 
entropy magnitudes were more a function of the number of variables than of the 
type of variables (original bands or transformed). TM had greater entropy values 
for Brightness variables and Brightness-Greenness pairs than did MSS. Information 
in the Tasseled Cap Third Component of TM was much greater than that of MSS, 
both by itself and in combination with Brightness or Greenness, confirming TM's 
greater dimensionality. 

In future studies, it is recommended that additional data sets be analyzed, both 
with larger sample sizes and with varied scene content; effects of other 
transformations might also be examined. Noise effects should be investigated 
through use of quantization factors to degrade radiometric resolutions. It may also 
be fruitful to investigate approaches to incorporate class membership into 
information-theoretic measures of multispectral information content. 


3.3 SIGNIFICANT RESULTS 

(1) Three noise effects present in Landsat-4 TM data were also found in 
Landsat-5 TM data. 

(2) Scan lines were missing from one raw data tape (unity RLUT); if the
same effect is present in other tapes, it could cause problems for 
investigators who are not aware of it. 

(3) Radiometric comparisons were established between raw TM data from 
coincident scenes of Landsat-4 and Landsat-5. 

(4) Additional information-theoretic comparisons of Landsat TM and MSS 
data were made. 


3.4 PUBLICATIONS AND PRESENTATIONS

A paper, entitled "Information Theoretic Comparison of Original and 
Transformed Data from Landsat MSS and TM", by William A. Malila, was 
prepared for presentation at the Eighteenth International Symposium on Remote 
Sensing of Environment, Paris, France, October 1984. It will be published in the
symposium proceedings. A preprint is included as Appendix A. 


3.5 RECOMMENDATIONS 

No additional major recommendations beyond those made in previous reports 
are identified at this time. 


3.6 FUNDS EXPENDED 

A total of approximately $39,000 was expended during the three months June 
through August 1984. An amendment to the contract was received to support 
additional analyses of Landsat-5 data. The cumulative spending through August 
represents approximately 65% of the amended contract total. Expenditures during
the period 1-20 September 1984 are not included in this percentage value.


3.7 DATA RECEIPTS 


Raw data tapes (unity RLUT CCT-AT) and calibration data tapes (CALDUMP)
were received during this quarter for the following scenes:

Alabama              P20/R37     5-0014-15460
Alabama              P20/R37     4-0608-15463
SE Alabama           P20/R38     5-0014-15463
Harrisburg (Night)   P111/R212   5-0052-02182
White Sands          P33/R37     5-0129-17075



Note: Scene 5-0052-02182 was missing lines 36-52 from Quadrant 3 Band 1, 
and line 102 from Quadrant 4, Band 1 on the Unity RLUT CCT-AT. 

Fully corrected data (CCT-PT) were received for two scenes: 

NE Alabama P20/R36 5-0014-15454 

Alabama P20/R37 4-0608-15463 


Figure 3(b). RELATIONSHIP BETWEEN LANDSATS-4 AND 5 TM DATA - BAND 2
(horizontal axis: Landsat-4 signal; vertical axis: Landsat-5 signal)

Figure 3(c). RELATIONSHIP BETWEEN LANDSATS-4 AND 5 TM DATA - BAND 3
(horizontal axis: Landsat-4 signal; vertical axis: Landsat-5 signal)

Figure 3(d). RELATIONSHIP BETWEEN LANDSATS-4 AND 5 TM DATA
(horizontal axis: Landsat-4 signal; vertical axis: Landsat-5 signal)


Table 1. Information Comparison for MSS and Six-Band TM Data Sets


A. VALUES FOR REAL AGRICULTURAL DATA (N. CAROLINA SCENE)

                                         MSS                SIX-BAND TM        TM GAIN
                                   NUMBER      BITS     NUMBER       BITS      (BITS)

SYSTEM CAPACITY   SENSOR           0.17x10^8   24       0.28x10^15   48        24
                  CCT              0.27x10^9   28       0.28x10^15   48        20

HYPERCUBE VOL.    SENSOR           0.43x10^6   18.7     0.43x10^12   38.6
                  CCT              0.32x10^7   21.6     0.43x10^12   38.6

DATA DISPERSION PATTERN:

  Observations, H max              3,468       11.8     13,015       13.7      1.91 (Spatial)

  MSS CCT, 7 Bits per Band:
    Unique cells; Entropy, H       2,898       11.4     12,903       13.7      2.27 (Total)
    Loss due to spectro-
    radiometric concentration      -           0.38     -            0.02      0.36 (Spec/Radiom)

  MSS Sensor, 6 Bits per Band:
    Unique cells; Entropy, H       1,730       10.3     12,903       13.7      3.34 (Total)
    Loss due to spectro-
    radiometric concentration      -           1.45     -            0.02      1.43 (Spec/Radiom)


B. VALUES FOR SYNTHETIC AGRICULTURAL DATA
(Assumes equal spatial resolution)

                                         MSS                SIX-BAND TM        TM GAIN
                                   NUMBER      BITS     NUMBER       BITS      (BITS)

"SYSTEM" CAPACITY
(MSS: 6 Bits/Band)                 0.17x10^8   24       0.28x10^15   48        24

OBSERVED HYPERCUBE VOLUME          0.10x10^7   20       0.99x10^12   40        20

DATA DISPERSION PATTERN:

  Observations, H max              2,276       11.15    2,276        11.15
  Unique cells                     817         -        2,260        -
  Entropy, H                       -           8.94     -            11.14     2.20 (Spec/Radiom)
  Loss due to spectro-
  radiometric concentration        -           2.21     -            0.014


(TM gain over seven-bit MSS data was one bit.)


INFORMATION THEORETIC COMPARISONS OF ORIGINAL
AND TRANSFORMED DATA FROM LANDSAT MSS AND TM*+


William A. Malila

Environmental Research Institute of Michigan 
Ann Arbor, MI 48107 


ABSTRACT 

A communications-theory approach is taken to analyze the dispersion and
concentration of signal values in various data spaces, irrespective of any
specific class memberships. Entropy, as defined by Shannon, is used to
quantify information. Mutual information is used to measure the information
represented by subsets of spectral variables. Examples of the concepts are
presented. Several different comparisons of information content are made.
These include comparisons of system design capacities, of data volumes
occupied by agricultural data in the spaces defined by original bands and by
transformed spectral (Tasseled Cap) variables, of the information contents of
original bands and Tasseled Cap variables, and of the information contents
of MSS and TM for the given agricultural data sets.


1. INTRODUCTION 

With multispectral data sets from remote sensing systems, questions
arise as to the relative merits of individual and groups of spectral bands
and transformed spectral variables. Classification-based measures are
frequently used for such comparisons, as are variance-based measures such as
principal component analysis.

The first objective of the effort reported here was to develop a
class-independent and non-parametric measure of the information content of
multispectral data. The second objective was to use it to analyze and compare
data from the two Landsat-4 sensors, the Multispectral Scanner System (MSS)
and the Thematic Mapper (TM).


2. METHOD

A communications-theory approach is taken to analyze the dispersion and
concentration of signal values in various data spaces. Entropy, as defined
by Shannon, is used to quantify information. The process of selecting a
subset of bands is viewed as the transmission of data through a lossy
communication channel, and the mutual information between input and output is
used to measure information transfer, i.e., the information represented by
the subset.

*Presented at the Eighteenth International Symposium on Remote Sensing of
Environment, Paris, France, October 1-5, 1984.

+This research was sponsored by the U.S. National Aeronautics and Space
Administration, Goddard Space Flight Center, Greenbelt, MD, under Contract
NAS5-27346.

The alternative measure is applied to MSS and six-band TM data of two
types. These are real Landsat-4 MSS and TM data acquired simultaneously from
an agricultural scene in North Carolina, and data values synthesized from
field-measured reflectance spectra of agricultural crops and soils using an
atmospheric model. These data were used in prior comparisons of the spatial
and spectral characteristics of Landsat TM and MSS data [1,2]. In the
synthetic data, samples are primarily from vegetation at a variety of ground
cover percentages, with many fewer examples of bare soil. All analyses of TM
data are limited to the six reflective bands; the thermal band is not
analyzed in this effort due to its coarser spatial resolution, its dependence
on emissive rather than reflective characteristics of scene materials, and
lack of a simulation data base.

Several different comparisons of information content are made. These
include comparison of TM and MSS system-design information capacities,
comparison of the data-space volumes spanned by the agricultural data in the
spaces defined by original bands and by transformed spectral (Tasseled Cap)
variables, comparison of the agricultural information content of original
bands to that of transformed variables, and comparison of the agricultural
information content of TM data to that of MSS.

2.1 INFORMATION MEASURE DERIVATION 

2.1.1 Basic Concepts. Shannon defined self information, $I(x_i)$, as a
measure of the information associated with knowing the occurrence of a signal
state $x_i$ which occurs with probability $P(x_i)$:

$$I(x_i) = \log_2 \frac{1}{P(x_i)} = -\log_2 P(x_i) \quad \text{(bits)} \tag{1}$$

The more rare the event, the greater is one's uncertainty about when it will
occur and, consequently, the greater is the information conveyed when it is
observed. Entropy, given the symbol H, is the value of self information when
averaged over all N possible states of x:

$$H(x) = \sum_{i=1}^{N} P(x_i) \log_2 \frac{1}{P(x_i)} \tag{2}$$
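
Equation (2) can be evaluated directly for small cases. The following is a minimal sketch; the function name `entropy` and the two example distributions are assumptions chosen for illustration and are not the numerical examples of Figures 2 and 3.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: the average self information over all states."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]     # four equally likely states
skewed = [0.5, 0.25, 0.125, 0.125]     # same states, non-uniform probabilities

h_uniform = entropy(uniform)
h_skewed = entropy(skewed)
print(h_uniform, h_skewed)
```

The uniform case gives the largest entropy for four states (2 bits); concentrating probability in fewer states lowers it.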

With two variables, the use of joint and conditional probabilities is
necessary:

$$H(x,y) = H(x) + H(y|x) \tag{3}$$

since

$$P(x,y) = P(x)\,P(y|x) \tag{4}$$

In computing the conditional entropy, the weighting assigned to each
information term is the joint probability of the states involved, i.e.,

$$H(x|y) = \sum_{i} \sum_{j} P(x_i, y_j) \log_2 \frac{1}{P(x_i|y_j)} \tag{5}$$


If we consider x to be the input to a communication channel and y to be
the output, we can define the mutual information transferred between them,
i.e., $I_M(x;y)$, as

$$I_M(x;y) = H(x) - H(x|y) \tag{6}$$

In words, the mutual information exchanged is the difference between H(x),
the information content of the input, and H(x|y), the information loss or
uncertainty about x when we are given the output y. When the total
information is transferred, H(x|y) = 0 and $I_M(x;y) = H(x)$. At the other
extreme, when y does not contain any information relatable to x,
H(x|y) = H(x) and therefore $I_M(x;y) = 0$, i.e., there is no mutual
information.

Figure 1(a) presents a concise graphical summary of these quantities and
their interrelationships. Note the special cases in Figure 1(b). Figures 2
and 3 present simple numerical examples that illustrate the concepts of
entropy and joint entropy in a quantitative fashion. Note that entropy is at
its maximum when all cells or states are equally likely. It can be reduced
by decreasing the number of cells occupied, by having a non-uniform
distribution or concentration of observations in the occupied cells, or by
doing both.
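
The quantities defined in Equations (1) through (6) can be illustrated with a short Python sketch. It is not taken from the report; the function names and the small probability tables are hypothetical and chosen only to reproduce the two limiting cases described above (total transfer and no transfer of information).

```python
import math

def entropy(probs):
    """Entropy in bits of a discrete distribution, as in Equation (2)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def mutual_information(joint):
    """I_M(x;y) = H(x) - H(x|y) for a table joint[i][j] = P(x_i, y_j)."""
    p_x = [sum(row) for row in joint]          # marginal of the input x
    p_y = [sum(col) for col in zip(*joint)]    # marginal of the output y
    # Conditional entropy, Equation (5): each term is weighted by the joint
    # probability, and 1 / P(x_i|y_j) = P(y_j) / P(x_i, y_j).
    h_x_given_y = 0.0
    for row in joint:
        for j, p_xy in enumerate(row):
            if p_xy > 0:
                h_x_given_y += p_xy * math.log2(p_y[j] / p_xy)
    return entropy(p_x) - h_x_given_y

# Hypothetical examples (not data from the report).
h_uniform = entropy([0.25, 0.25, 0.25, 0.25])   # all states equally likely
h_skewed = entropy([0.7, 0.1, 0.1, 0.1])        # non-uniform distribution
i_total = mutual_information([[0.5, 0.0], [0.0, 0.5]])      # y determines x
i_none = mutual_information([[0.25, 0.25], [0.25, 0.25]])   # y independent of x
```

In this sketch the uniform case gives the larger entropy, the first table transfers the full H(x), and the second transfers none.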


2.1.2 Multispectral Extension. The above concepts can be extended to
multispectral variables by letting the variables x and y become multidimensional
vectors X and Y, with X = (X_1, X_2, ..., X_{N_x}) and Y = (Y_1, Y_2, ..., Y_{N_y}).
Usually, N_y ≤ N_x. The transformation achieved by the communication channel
is used here in a general sense, to represent both simple selections of
spectral band subsets and more complex transformations, such as the Tasseled
Cap Transformation.

The following relationship was derived to compute entropy from counts of
spectral cell populations (shown here for six variables):

$$ H(X) = \log_2 N_{obs} - \frac{1}{N_{obs}} \sum_{ijklmn} C_{ijklmn} \log_2 C_{ijklmn} \qquad (7) $$

(first term: information if each observation were in a unique cell;
second term: information loss due to concentration of the observations
into a subset of cells)

where

     C_{ijklmn}  is the count of occurrences in the cell having Level i in
                 X_1, Level j in X_2, etc.,

and  N_{obs}     is the total number of observations in the data set being
                 analyzed.


The entropy of X is expressed in Equation (7) as the difference between two
terms. The first, log_2 N_obs, is the maximum possible information associated
with the given number of observations, i.e., the information that would be
present if each observation were unique and occupied a unique cell in the
signal space. The second term represents the information that is lost by any
concentration of observations into a subset of cells.
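
A minimal sketch of Equation (7) in Python follows. It assumes each observation is a tuple of integer band levels; the function name and the four sample pixels are hypothetical and are not data from the report.

```python
import math
from collections import Counter

def entropy_from_cells(observations):
    """Return (H_max, loss, H) in bits, following Equation (7)."""
    # Count how many observations fall in each occupied spectral cell.
    counts = Counter(tuple(obs) for obs in observations)
    n_obs = sum(counts.values())
    # First term: information if every observation were in a unique cell.
    h_max = math.log2(n_obs)
    # Second term: loss due to concentration of observations in cells.
    loss = sum(c * math.log2(c) for c in counts.values()) / n_obs
    return h_max, loss, h_max - loss

# Hypothetical two-band pixels: one cell holds two observations.
pixels = [(10, 20), (10, 20), (11, 20), (12, 25)]
h_max, loss, h = entropy_from_cells(pixels)
```

For these four pixels the maximum is 2 bits, the doubly occupied cell costs 0.5 bit, and the entropy is 1.5 bits, which matches the value obtained from Equation (2) with cell probabilities 0.5, 0.25, and 0.25.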


2.1.3 Spectral Band Subsetting. The selection of subsets of spectral
bands is a special case of the mutual information expression,

$$ I_M(X;Y) = H(X) - H(X|Y) $$

where Y now is a subset, X', of the X variables, so

$$ I_M(X;X') = H(X) - H(X|X') $$

Whenever a variable, say X_p, is retained, its conditional probability term
becomes unity, its contribution to H(X|X') is reduced to zero, and its
information content is retained as mutual information. Whenever a variable,
say X_q, is eliminated, there is a loss of mutual information. This loss is
represented by the conditional entropy term through all conditional
probability components in which X_q occurs on the left-hand side of the conditional
probability indicator line but not on the right-hand (or given) side.
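
The subsetting expression can be sketched in Python as below. The sketch is an illustration, not the report's procedure: the helper names, the exhaustive search over subsets, and the sample pixels are assumptions. It uses the identity H(X|X') = H(X,X') - H(X'), together with H(X,X') = H(X) because X' is contained in X.

```python
import math
from collections import Counter
from itertools import combinations

def cell_entropy(rows):
    """Entropy in bits from cell occupancy counts, as in Equation (7)."""
    counts = Counter(rows)
    n_obs = sum(counts.values())
    return math.log2(n_obs) - sum(c * math.log2(c) for c in counts.values()) / n_obs

def subset_information(pixels, bands):
    """I_M(X;X') = H(X) - H(X|X') for the band indices in `bands`."""
    full = [tuple(p) for p in pixels]
    sub = [tuple(p[b] for b in bands) for p in pixels]
    h_x = cell_entropy(full)
    # H(X|X') = H(X, X') - H(X'), and H(X, X') = H(X) since X' is part of X.
    h_x_given_sub = h_x - cell_entropy(sub)
    return h_x - h_x_given_sub

def best_subsets(pixels, n_bands):
    """Best band subset of each size by exhaustive search (assumed approach)."""
    return {
        size: max(combinations(range(n_bands), size),
                  key=lambda bands: subset_information(pixels, bands))
        for size in range(1, n_bands + 1)
    }

# Hypothetical three-band pixels (not data from the report).
pixels = [(1, 5, 0), (1, 6, 0), (2, 5, 0), (2, 6, 1)]
info_pair = subset_information(pixels, (0, 1))
info_last = subset_information(pixels, (2,))
best = best_subsets(pixels, 3)
```

Here bands 0 and 1 together separate all four pixels, so retaining them loses no information, while band 2 alone conveys less than one bit.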

2.1.4 Spectral Transforms. Spectral transformations were obtained by
applying the linear-combination Tasseled Cap (TASCAP) transformations to MSS
[3] and TM [4] data. The principal TASCAP variables are Brightness and
Greenness. Also, principal-component analysis was utilized to obtain a 
different set of spectral variables for one comparison. 

3 RESULTS 


3.1 SPECTRAL DATA VOLUMES

The diagram in Figure 4 helps describe the various terms used here to 
designate spectral data-space characteristics, while Table I quantifies many 
of the observed values. Figure 5 presents information measures for two of 
those quantities, as a function of the number of data variables. First, the 
system-design capacities of the Landsat-4 TM and MSS are presented, in terms 
of the number of bits transmitted to the ground and/or recorded on computer-
compatible tapes (CCTs). For TM, the number of bits recorded on CCTs is the
same as that transmitted (8 bits/channel). For MSS, however, the six-bit
telemetered data are expanded to seven bits on the CCTs, with only an
apparent gain of information. Nevertheless, many comparisons involving MSS
will use seven-bit data since that is the form in which we received them.
For some others, a degradation to six bits was performed before analysis.
The greater information potential of the TM system design (reflective bands),
as compared to the MSS system, is quantified as 48 vs 24 bits in telemetered
data.

Figure 5 also portrays the "hypercube" volume or data-space volume 
spanned by TM and MSS data. These volumes are computed by summing the bit 
equivalents of the observed data-value ranges (max - min + 1) in each band
being considered. Upon comparing the fractions of their total data-space 
volumes that are spanned by data from the agricultural scene, one observes 
that the TM data fall nine bits short of capacity while the MSS data fall 
approximately six bits short of capacity. 
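
The capacity and hypercube figures above follow from simple arithmetic, sketched here in Python. The band counts (six reflective TM bands, four MSS bands) and bits per band are those implied by the 48-, 24-, and 28-bit capacities; the function names and the sample pixels are hypothetical.

```python
import math

def capacity_bits(n_bands, bits_per_band):
    """System-design capacity: total bits per pixel over all bands."""
    return n_bands * bits_per_band

def hypercube_bits(pixels):
    """Sum over bands of log2(max - min + 1), the hypercube volume in bits."""
    bands = list(zip(*pixels))  # regroup pixel tuples into per-band sequences
    return sum(math.log2(max(band) - min(band) + 1) for band in bands)

tm_capacity = capacity_bits(6, 8)    # six reflective bands at 8 bits
mss_sensor = capacity_bits(4, 6)     # four bands at 6 bits (telemetered)
mss_cct = capacity_bits(4, 7)        # four bands at 7 bits (on CCT)

# Hypothetical two-band pixels: ranges of 16 and 8 levels.
sample = [(10, 40), (25, 47), (17, 43)]
sample_volume = hypercube_bits(sample)
```

For the sample pixels the observed ranges span 16 and 8 levels, giving 4 + 3 = 7 bits, compared with a capacity that depends only on the number of bands and bits per band.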

Actual data dispersion volumes (see Figure 4) were found to be sub- 
stantially smaller than the hypercube volumes. Results for both real and 
synthetic data are shown in Figure 6 for MSS (7 bits/band; CCT) and Figure 7 
for TM. (Note that these values for actual information are substantially
smaller than reported elsewhere for similar comparisons in which the hypercube
volumes are treated as the information content [5].) The data
dispersion volumes in Figures 6 and 7 are measured by the entropies of the 
best variable combinations for the respective observation sets, and represent 
the information present in those sets. Most of the information is contained 
in the first two or three variables. The number of observations analyzed 
establishes a maximum limit on each entropy value. As shown earlier in 
Equation (7), the concentration of multiple observations (pixels) into 
individual spectral cells reduces the information content below the potential
maximum. Table I shows very little tendency for TM pixels to do this, due to 
the very large system capacity, spectral diversity, and fine gradation of the 
TM bands. The MSS data show definite tendencies for multiple observations in 
spectral cells.

Table I shows that the TM data represent 3.3 bits more information than
the MSS sensor data, with approximately two bits being associated with
spatial resolution (pixel size and number) and the remainder with spectral
bands and radiometric resolution. Since the synthetic data have the same
number of observations for both TM and MSS, they can be considered to have
equal spatial resolutions. Thus, the 2.2-bit difference between them must be
due to their spectral and radiometric properties.

3.2 SPECTRAL TRANSFORMATIONS

Figure 8 compares the data-space volumes spanned by original and transformed
versions of signals from the agricultural scene. It appears that a
bit-rate reduction of about 3 bits/pixel could be achieved for this agricultural
scene, without loss of information (see discussions of Figures 9 and
the original bands. These differences might be greater for data sets with a 
broader range of scene amplitudes. 

Figure 9 compares the agricultural information content of original and 
TASCAP variables from TM and MSS for the North Carolina scene. In each case, 
the best subset of each size was used. Here again, relatively little infor- 
mation is gained by the inclusion of more than three variables. 

Figure 10 illustrates, for the synthetic MSS data set, the fact that the
information contents of original band values and of two types of transformed
variables are essentially identical. In addition to TASCAP variables, the
information content of principal-component variables for this data set is
also displayed. The equality of the complete sets of variables is in keeping
with theoretical considerations of linear transformations.
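
The theoretical point can be illustrated with a small Python sketch. An invertible linear transformation maps distinct spectral cells to distinct cells, so the cell occupancy counts, and hence the entropy of the complete variable set, are unchanged, provided the transformed values are not re-quantized. The 2 x 2 matrix and the pixels below are hypothetical and are not the TASCAP coefficients.

```python
import math
from collections import Counter

def cell_entropy(rows):
    """Entropy in bits from cell occupancy counts, as in Equation (7)."""
    counts = Counter(rows)
    n_obs = sum(counts.values())
    return math.log2(n_obs) - sum(c * math.log2(c) for c in counts.values()) / n_obs

def transform(pixel, matrix):
    """Apply a linear combination of bands (one matrix row per new variable)."""
    return tuple(sum(w * v for w, v in zip(row, pixel)) for row in matrix)

# Hypothetical invertible matrix (sum and difference of two bands).
matrix = [(1, 1), (1, -1)]
pixels = [(3, 1), (3, 1), (4, 2), (5, 1), (2, 4)]
transformed = [transform(p, matrix) for p in pixels]

h_original = cell_entropy(pixels)
h_transformed = cell_entropy(transformed)
# Keeping only the first transformed variable merges cells and loses information.
h_first_only = cell_entropy([(t[0],) for t in transformed])
```

The complete transformed set has the same entropy as the original bands, while the first transformed variable alone has less.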

3.3 SUBSETS OF VARIABLES

Mutual information values for the best and worst original-band subsets 
of each size are presented in Figure 11, to illustrate the range of infor- 
mation conveyed by various subsets of the variables. The differences are 
greatest among pairs of variables for both TM and MSS. Figure 12 is a
similar comparison for TASCAP variables. In this case, we find an even 
greater disparity between best and worst combinations, due to the decreased 
information content of the last TASCAP variables. 

3.4 DIMENSIONALITY

Figure 13 displays information measures computed for the first three 
Tasseled Cap components of TM and MSS data from the agricultural scene. (The 
MSS data were in CCT form at seven bits/band.) The first three components
are individually quite similar for TM, but there is a substantial decrease
(3.3 bits below Brightness) for the third component of MSS (Yellowness).
This is consistent both with many investigators' experiences in finding MSS
data of agricultural areas to be primarily two dimensional and with recent
studies which have found a substantial amount of information in the TM
Tasseled Cap Third Component [4]. Throughout this comparison, TM values are
greater than the corresponding MSS values; for example, the TM Brightness
value is 6.7 bits compared to 5.8 bits for MSS.

When pairs of components are considered, we see substantial increases in
total information, as would be expected with the addition of a second variable;
the value for TM Brightness/Greenness is 4.3 bits greater than for
Brightness alone, and the corresponding increase for MSS is 3.7 bits. However,
differences do appear between MSS and TM. Whereas the value of the
Brightness/Greenness pair for MSS is substantially greater than the other two
(approximately two bits greater than Greenness/Third Component), there again
is relatively little difference (less than 0.4 bits) among the three pairings
from TM data, pointing to a higher dimensionality in TM.

Three components captured the vast majority of information for both 
systems. However, the fact that the gain in going from two to three com- 
ponents was nearly as large for MSS (1.25 bits) as for TM (1.7 bits) was 
somewhat surprising in view of the previously discussed two-dimensional 
character of MSS data. Furthermore, principal-component analysis of MSS data 
showed nearly total representation of variance by the first two components. 
The MSS gain likely is due to the Brightness/Greenness plane having a thickness
of several counts in the third direction, even though this third
component was uncorrelated with the others. The observed values also
indicate that differences do exist among these various measures of
multispectral signal properties. The TM data pattern also may be somewhat
planar in three space, although not aligned as well with any component axis;
correlations with the Third Component were -0.69 for Brightness and 0.36 for
Greenness in this data set. None of these observations should diminish the
utility [3,4] of Tasseled Cap transforms for physical interpretation of data
values and agricultural scene characteristics.

3.5 NOISE 

Noise in multispectral data was not considered explicitly in the results 
presented herein. Sensor noise effects certainly were present in the real 
Landsat data and natural variations of crop observations were present in both 
the real and synthetic data. Noise can add variance to signals and increase 
the number of spectral cells occupied (above that for no noise), thereby 
creating an apparent information content greater than the true information 
content of ideal, noiseless signals. One might address such effects by
reducing the number of discrete levels present in data sets by applying an 
appropriate quantization factor (greater than unity) to each band and 
computing the reduced information content. 
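
One possible reading of this suggestion is sketched below in Python. The report does not specify how the quantization factor would be applied; integer division of each band value by the factor is an assumption made here, and the function names and sample pixels are hypothetical.

```python
import math
from collections import Counter

def cell_entropy(rows):
    """Entropy in bits from cell occupancy counts, as in Equation (7)."""
    counts = Counter(rows)
    n_obs = sum(counts.values())
    return math.log2(n_obs) - sum(c * math.log2(c) for c in counts.values()) / n_obs

def quantize(pixels, factor):
    """Coarsen every band by an integer factor (assumed: floor division)."""
    return [tuple(value // factor for value in p) for p in pixels]

# Hypothetical two-band pixels: two tight clusters differing by one count,
# as might result from noise on two underlying signal levels.
pixels = [(40, 81), (41, 80), (40, 80), (57, 96), (56, 97)]
h_full = cell_entropy(pixels)                 # every pixel in its own cell
h_coarse = cell_entropy(quantize(pixels, 2))  # one-count differences merged
```

With a factor of 2, the one-count differences within each cluster collapse into a single cell and the computed entropy drops from log2(5) bits to roughly one bit, closer to what two noiseless signal levels would give.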

4. SUMMARY DISCUSSION 

An information-theoretic measure was defined and applied to Landsat
multispectral data, both real and synthetic. Examples of the basic concepts 
also were generated. The measure does quantify signal dispersion patterns, 
independently of class membership and distributional assumptions. It also 
provides an alternate method (to classification) of measuring the extent to 
which subsets of bands or transformed variables represent the total pattern. 
In planning analyses and interpreting results, however, analysts should 
ensure that data sets being analyzed are representative of the problems under
consideration. 

A number of observations were made from this initial study. The 
system-design information capacity of TM is much greater than that of MSS. 
The potential information capacities and the signal "hypercube" volumes of 
agricultural data were much larger than the information actually represented 
by signal dispersion patterns in the sets of data values analyzed. Tasseled
Cap transformations preserved the information in original bands and offered a
modest savings in bits over those of original bands, a fact which might be
useful in data compression approaches. There were extremely few multiple
occurrences of spectral observations in the TM data set, but a reasonably
high number for the MSS data, another indication of TM's finer partitioning
of spectral space. For the "best" combinations of variables, entropy
magnitudes were more a function of the number of variables than of the type
of variables (original bands or transformed). TM had greater entropy values
for Brightness and Brightness/Greenness than did MSS. Information in the
Tasseled Cap Third Component of TM was much greater than that of MSS, both by
itself and in combination with Brightness or Greenness, confirming TM's
greater dimensionality.


In future studies, it is recommended that additional data sets be 
analyzed, both with larger sample sizes and with varied scene contents; 
effects of other transformations might also be examined. Noise effects
should be investigated through use of quantization factors to degrade radio- 
metric resolutions. It may also be fruitful to investigate approaches to 
incorporate class membership into information-theoretic measures of 
multispectral information content.

Table I. Information Comparison for MSS and Six-Band TM Data Sets

A. VALUES FOR REAL AGRICULTURAL DATA (N. CAROLINA SCENE)

                                           MSS                SIX-BAND TM        TM GAIN
                                     NUMBER      BITS      NUMBER       BITS     (BITS)

SYSTEM CAPACITY:   SENSOR            0.17x10^8   24        0.28x10^15   48       24
                   CCT               0.27x10^9   28        0.28x10^15   48       20

HYPERCUBE VOL.:    SENSOR            0.44x10^6   18.7      0.43x10^12   38.6
                   CCT               0.32x10^7   21.6      0.43x10^12   38.6

DATA DISPERSION PATTERN:
  Observations, H_max                3,468       11.8      13,015       13.7     1.9 (Spatial)

  MSS CCT (7 bits per band):
    Unique cells                     2,898       -         12,903       -
    Entropy, H                       -           11.4      -            13.7     [illegible] (Total)
    Loss due to spectro-
    radiometric concentration        -           0.38      -            0.02     [illegible] (Spec/Radiom)

  MSS Sensor (6 bits per band):
    Unique cells                     1,730       -         12,903       -
    Entropy, H                       -           10.3      -            13.7     3.3 (Total)
    Loss due to spectro-
    radiometric concentration        -           1.45      -            0.02     [illegible] (Spec/Radiom)

B. VALUES FOR SYNTHETIC AGRICULTURAL DATA
   (Assumes equal spatial resolution)

                                           MSS                SIX-BAND TM        TM GAIN
                                     NUMBER      BITS      NUMBER       BITS

"SYSTEM" CAPACITY (MSS: 6 Bits/Band) 0.17x10^8   24        0.28x10^15   48       24

OBSERVED HYPERCUBE VOLUME            0.10x10^7   20        0.99x10^12   40       20

DATA DISPERSION PATTERN:
    Observations, H_max              2,276       11.15     2,276        11.15
    Unique cells                     817         -         2,260        -
    Entropy, H                       -           8.94      -            11.14    2.20 (Spec/Radiom)
    Loss due to spectro-
    radiometric concentration        -           2.21      -            0.014

(TM gain over seven-bit MSS data was one bit.)

Figure 2. Entropy Examples

Figure 3. Joint Entropy Examples